Abstract

Presently, a considerable body of knowledge engineering
research focuses on the automatic building of ontologies.
However, the uncertainty of the techniques and heuristics
adopted during the construction process has led researchers
to explore methods for verifying and improving the quality
of the outputs. To this end, we propose an approach for
checking the hierarchical structure of ontologies that uses
the WordNet lexical database as a background knowledge
source. To test our work, we apply the proposed method to
an existing valid geographic objects ontology.

Keywords: geographic objects, knowledge modeling,
ontology building, taxonomic structure, similarity
measures, evaluation, quality.

1 Introduction 

Satellite imagery is a relevant source of information for
the identification of the objects that make up the surface
of the earth. Its exploitation in a spatio-temporal context
helps to monitor and predict the behavior of these objects
over time and to take appropriate decisions for the
management of the environment.

Indeed, the advent of high-resolution images enabled the
development of an object-oriented approach in which the
analysis of a scene is attached to groups of pixels
representing concrete objects with specific semantics.
However, this progress leads to a large amount of available
information that cannot be processed in its entirety by
domain experts. This has motivated research on the full or
partial automation of knowledge representation and
extraction applied to satellite imagery. To exploit
geographic image databases, researchers have used several
knowledge representation formalisms, in particular
ontologies.



Nowadays, ontologies are becoming very popular in the area
of knowledge management and sharing, especially with the
evolution of the Semantic Web. They are considered one of
the most powerful tools for knowledge representation and
reasoning. They aim to provide a commonly accepted
understanding of a specific domain through the generic
modeling, exchange and sharing of its specific knowledge.
Knowledge is modeled in the form of concepts and their
relations to each other. Several studies have addressed the
use of standardized ontologies to share and annotate
satellite image information [7, 1, 8, 4, 26, 20, 31]. The
majority of these works presuppose the existence of a
domain ontology that may be developed, or be carried out,
within the target application [4]. However, few studies
have focused on their evaluation or validation.

In fact, the quality of an ontology is highly sensitive to
many parameters, such as the consistency of the semantic
resources from which it is built and the techniques and
heuristics used to extract and organize relevant knowledge
[19]. Therefore, as for all engineering artifacts, assessing
the quality of ontologies remains an important issue for
ontology engineering. The evaluation covers the structure
and the content of ontologies and makes it possible to
verify several related criteria, such as their consistency
and their adequacy to the user's requirements and
pre-established constraints.

In this paper, our main research question is how to examine
the taxonomic structure of a given geographic objects
ontology. First, we summarize the main evaluation
alternatives. Second, we present our method and the related
structural measure for verifying the hierarchical structure
of an ontology based on the WordNet¹ lexical database.
Third, we devote the last section to an experimental study
in which we present and interpret the results of applying
our proposal to a geographic objects ontology.



1 https://wordnet.princeton.edu/






Graphics, Vision and Image Processing, V. 15, No. 2, ISSN 1687-398X, Delaware, USA, December 2015 



2 Ontology evaluation: State of the art

Evaluation is a crucial phase in the ontology building
process. It helps to simplify ontology development, to
ensure relevance to the requirements of a particular domain
and to detect possible ontology changes. However, the lack
of a unifying framework of methods and metrics for
evaluating ontologies has led to several attempts, each of
which defines its own method and set of metrics. In this
section, we summarize the main evaluation methods, which
can be classified according to their purpose into three
categories: ranking, correctness, and quality.

When trying to reuse existing ontologies for a particular
study domain, we are faced with the problem of determining
which ones suit our needs. In this context, [17] have
presented an approach for clustering ontologies. The main
goal of this approach is to use a set of similarity
measures for comparing ontology-based meta-data. Based on
this work, [27] have developed the OntoQA approach, which
analyzes ontology schemas and their populations and
describes them through a well-defined set of schema and
meta-data metrics. The first group comprises the schema
metrics, which evaluate the ontology design and its
potential for knowledge representation. The second group
evaluates the structure of the knowledge base and, more
specifically, how data is placed in the ontology.

Further, the ranking category includes approaches for
ranking and selecting ontologies. These approaches rank a
set of candidate ontologies in order to choose the most
appropriate one for a particular task. Ontometric [15] is
one of the main methods used for systematic ontology
selection. It aims to suggest the best ontology for a
particular project on the basis of 160 properties organized
in five dimensions of quantitative measurements: content,
language, methodology, tool and costs. [21] have provided a
corpus-based method to evaluate the functional adequacy of
ontologies. [22] have proposed an ontology selection and
ranking model consisting of selection standards and metrics
based on better semantic matching capabilities. The
proposed model enhances ontology selection and ranking by
enabling semantic matching of taxonomy or relational
linkage between concepts, and by identifying which measures
should be used to rank ontologies in a given context and
what weight should be assigned to each selection measure.
FOEval [3] is another model, which presents two main
features. First, it enables users to select, from a set of
proposed metrics, those that help them in the ontology
evaluation process, and to assign a weight to each one
based on its assumed impact on this process. Second, it
enables users to evaluate locally stored ontologies and/or
to query search engines for available ontologies. The main
goal of this model is to ease the ontology evaluation task
for users wishing to reuse available ontologies, enabling
them to choose the ontology best suited to their
requirements. To evaluate and rank candidate ontologies,
FOEval uses a set of metrics that includes coverage,
richness, detail level, comprehensiveness, connectedness
and computational efficiency.

The correctness category includes the approaches that
account for the formal correctness of the ontological
knowledge and of the primitives used. In this category, the
best-known approach is OntoClean [12], which is designed to
justify the kinds of decisions that experienced ontology
builders make and to explain the common mistakes of the
inexperienced, as it analyses the intensional content of
concepts. It is based on the principles of rigidity,
identity, unity and dependence. Based on this method, [5]
have developed a framework that looks for taxonomic
problems such as circularity and redundancy, as well as
errors in disjoint groups. [28] have developed another tool
for evaluating real-world ontologies. [30] have proposed a
tool that evaluates correctness through an internal
evaluation based on the correct usage of OWL primitives.

The third category addresses the evaluation of the global
quality of an ontology. Following this approach, the
EvaLexon method [25] aims to evaluate ontologies while they
are being developed from texts. It assesses how appropriate
the terms of the ontology are. The relevance of a term is
judged by its frequency in the text from which the ontology
was built and by the list of terms for a specific domain.
The evaluation is based on four metrics: precision, recall,
coverage and accuracy. In turn, [9] have approached
ontology evaluation as a diagnostic task based on ontology
descriptions, using three categories of criteria:
structural (depth, breadth, tangledness, dispersion,
consistency, anonymous classes, cycles, and density),
functional (competence adequacy, functional modularity,
precision, recall and accuracy), and usability profiling
(documentation, efficiency, interfacing). By combining the
different measurable criteria of each category, nine
quality principles (qoods) are defined: cognitive
ergonomics, transparency, integrity and computational
efficiency, meta-level integrity, flexibility, expertise
compliance, conformity with extension, integration and
adaptation procedures, generic access and organizational
ability.

To assess the quality of evolving ontologies, [16] have
proposed a set of cohesion metrics that are considered
stable, in that their results do not depend on the semantic
or structural representation of the ontology. In the same
way, [6] proposed the Onto-Evoal approach, which is based
on an evaluation model that guides the management of
inconsistencies by assessing the impact of proposed
resolutions on the content and use of the ontology. This
model defines a set of quantitative metrics for choosing a
resolution that preserves the quality of the evolved
ontology. The quality criteria considered in this approach
are complexity, cohesion, taxonomy, abstraction,
modularity, completeness and understanding. Referring to
the work of [10] and [11], [29] presents a theoretical
framework for assessing the quality of an ontology for the
Web. The framework summarizes ontology evaluation methods
along two dimensions: ontology quality criteria (accuracy,
adaptability, clarity, completeness, computational
efficiency, conciseness, consistency, and organizational
fitness) and ontology aspects (vocabulary, syntax,
structure, semantics, representation, and context).
Building on the two broad meta-properties of unity and
simplicity, [2] have developed an evaluation methodology
called OntoAbsolute that assesses the taxonomic and
non-taxonomic relationships, analyzes the conceptual
structure and evaluates the ontology as a whole.

3 Proposed evaluation method 

Our method for analyzing the taxonomic consistency of
ontologies is based on two key elements: (1) the projection
of the ontology to be evaluated onto WordNet, and (2) the
checking of the conformity of its hierarchical links with
those linking the corresponding WordNet synsets.

WordNet is an on-line lexical database that lists,
classifies and connects in various ways the semantic and
lexical content of a number of languages such as English
and French [18]. For each word of the language, WordNet
offers a list of synsets (synonym sets) that correspond to
all its possible meanings. The synset is the building block
upon which the entire system rests. It corresponds to a
group of interchangeable words denoting one particular
sense or usage. Words and synsets are interconnected by a
number of lexical relations such as hyponymy/hyperonymy,
holonymy/meronymy and synonymy/antonymy. These
relationships can be exploited to explore the exact meaning
of a given word. The third release of WordNet² offers
155287 words expressing 117659 different meanings
(synsets).

These figures reveal the semantic richness of WordNet and
support its use as a reference taxonomy for verifying the
structure of ontologies. However, its generic nature calls
for special attention to polysemy. Indeed, for a given
concept identifier, WordNet has multiple possible nodes,
each of which belongs to a particular context and refers to
a different meaning. Consequently, correctly locating a
concept in WordNet amounts to finding the synset that
reflects its exact meaning.

3.1 Projection of ontology on WordNet

The aim of this step is to locate the concepts of our
ontology in WordNet, which serves as a reference support
for the analysis and validation of the taxonomic structure
of the ontology.

To do this, we need to find the corresponding WordNet
synset for each concept. Obviously, this treatment cannot
be limited to a simple search for the concept identifier in
WordNet, since the same word can have multiple meanings.
Therefore, to map a given concept to WordNet, we need to
distinguish, among all proposed synsets, the one that
corresponds best. Our solution is to involve the context of
the concept when locating it in WordNet. The context of a
concept is described by its identifier, labels, comments,
neighborhood and properties.

The most appropriate synset for a given concept is 
the one that shares with it the maximum of knowledge 
in terms of neighborhood and textual descriptions. 

Figure 1: Mapping between ontology concepts and 
WordNet synsets 

Given the following:

W(F, S) defines the vocabulary admitted by WordNet. It
corresponds to a set of pairs (F, S), where F is a form,
i.e. a string over a finite alphabet, and S = {s / F} is
the set of senses supported by F. s denotes an element of
the set of meanings S (i.e. a synset).

Let the function P(c, s_i) denote the degree of knowledge
sharing between the concept c and the synset s_i, i.e. the
i-th synset of the identifier name of c. The relevant
synset for a concept c must satisfy the following
condition:

$$ s_k = \mathrm{relSyn}(c) \;\Rightarrow\; P(c, s_k) > P(c, s_j) \quad \forall\, s_j,\ j \neq k \qquad (1) $$

The function P is described by the following algorithm:

• syn(c) a function that returns the synsets related to
the identifier of the concept c.



2 http://wordnet.princeton.edu/wordnet/man/wnstats.7WN.html



• lab(c) a function that returns the set of labels of 
the concept c. 






• com(c) a function that returns the comments
associated with the concept c.

• super(c) a function that returns the direct subsumer
of the concept c.

• w_com(c) a function that returns the significant
words included in the comments associated with
the concept c.

• w_syn(s) a function that returns the set of synonym
words related to the synset s.

• w_gloss(s) a function that returns the significant
words that compose the definition associated with
the synset s.




3.2 Validation of ontology taxonomic structure

Once the concepts of the ontology to be evaluated are 
mapped with the WordNet synsets, it is now possible 
to check the compatibility between the taxonomic 
structure of the ontology and that of corresponding 
synsets. 

The hypothesis on which we base our assessment is that a
given subsumption relationship between two concepts is
considered valid only if their corresponding synsets are
connected by the shortest hyperonymy path, compared to the
paths linking the synset of the subsuming concept to the
synsets associated with all the other concepts.
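The hypothesis above can be sketched in a few lines of Python. This is only one possible reading, under stated assumptions: the hypernymy graph, the synset names (`syn_lake`, etc.) and the concept-to-synset mapping `rel_syn` are hypothetical toy data, not taken from WordNet or from AKTiveSA, and a subsumption link is accepted when no other concept's synset lies closer to the subsumer's synset than the child's synset does.

```python
from collections import deque

def build_graph(hypernym_edges):
    """Undirected adjacency sets built from (hyponym, hypernym) pairs."""
    graph = {}
    for hypo, hyper in hypernym_edges:
        graph.setdefault(hypo, set()).add(hyper)
        graph.setdefault(hyper, set()).add(hypo)
    return graph

def path_length(graph, start, goal):
    """Edges on the shortest path between two synsets (BFS); None if unreachable."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def is_valid_subsumption(graph, rel_syn, child, parent):
    """True if no other concept's synset is closer to the parent's synset."""
    target = rel_syn[parent]
    lengths = {c: path_length(graph, s, target)
               for c, s in rel_syn.items() if c != parent}
    reachable = [d for d in lengths.values() if d is not None]
    return lengths[child] is not None and lengths[child] == min(reachable)

# Hypothetical hypernymy fragment and mapping (illustration only).
graph = build_graph([
    ("syn_lake", "syn_body_of_water"),
    ("syn_stream", "syn_body_of_water"),
    ("syn_river", "syn_stream"),
    ("syn_pond", "syn_lake"),
])
rel_syn = {"BodyOfWater": "syn_body_of_water", "Lake": "syn_lake",
           "Stream": "syn_stream", "River": "syn_river", "Pond": "syn_pond"}

river_under_stream = is_valid_subsumption(graph, rel_syn, "River", "Stream")
river_under_water = is_valid_subsumption(graph, rel_syn, "River", "BodyOfWater")
```

In this toy data the link River/Stream is accepted, whereas a direct link River/BodyOfWater is rejected because Lake and Stream are closer to the subsumer. Ties are tolerated here so that sibling concepts do not invalidate each other, which is a design choice of this sketch.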



To extract the relevant synset for the root concept of our
ontology, we proceed as follows:

• If the concept has a single synset in WordNet,
it is then the corresponding synset.

• If the concept has labels, the sharing degree between
it and a given synset is described by the
intersection of their respective labels and synonyms.

• If the concept has comments, the sharing degree
between it and a given synset is described
by the intersection of their respective comments
and definitions.

• Otherwise, the selection can be done manually
(only for the root concept).

Input: c (a concept name identifier), s (a candidate synset of c)

BEGIN

if (|syn(c)| = 1)
    P = 1
else if (|lab(c)| > 0)
    P = |lab(c) ∩ w_syn(s)| / |lab(c)|
else if (|com(c)| > 0)
    P = |w_com(c) ∩ w_gloss(s)| / |w_com(c)|
else return -1 (cannot locate the root c in WordNet)

END.
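As an illustration, the algorithm for the root concept could be written as follows in Python. This is a minimal sketch under assumptions: the stop-word list and the tokenization used to obtain "significant words" are choices of this example (the method does not specify them), the function names are hypothetical, and the two glosses are written for the example rather than quoted from WordNet. Only the Lake comment comes from Table 1.

```python
import re

# Small illustrative stop-word list (an assumption of this sketch).
STOPWORDS = frozenset({"a", "an", "the", "of", "is", "are", "by", "to",
                       "in", "for", "and", "or", "that", "with"})

def significant_words(text):
    """Lower-cased alphabetic tokens of `text`, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def root_sharing_degree(n_synsets, labels, comment, synonyms, gloss):
    """P(c, s) for a root concept c and a candidate synset s.

    n_synsets: |syn(c)|; labels: lab(c); comment: com(c) as one string;
    synonyms: w_syn(s); gloss: definition text of s.
    """
    if n_synsets == 1:
        return 1.0
    if labels:
        labels = {label.lower() for label in labels}
        return len(labels & {w.lower() for w in synonyms}) / len(labels)
    w_com = significant_words(comment)
    if w_com:
        return len(w_com & significant_words(gloss)) / len(w_com)
    return -1  # cannot locate the root concept in WordNet

comment = "A lake is a body of water surrounded by land."   # from Table 1
gloss_a = "a body of fresh water surrounded by land"        # illustrative gloss
gloss_b = "a purplish red pigment prepared from dyes"       # illustrative gloss

p_a = root_sharing_degree(3, [], comment, [], gloss_a)  # 4 shared words out of 5
p_b = root_sharing_degree(3, [], comment, [], gloss_b)  # no shared word
```

The guard on an empty `w_com` avoids a division by zero when a comment contains only stop words, a case the pseudocode does not address.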

The identification of the corresponding synset for a given
non-root concept c, however, is based on the computation of
the distance that separates this synset from the one
associated with the closest subsumer of c in the ontology
to be evaluated. We assume that the relevant synset s_k for
a given concept is the one that, among the other synsets
s_i of the same concept, is connected by the smallest
number of subsumption links to the corresponding synset of
its subsumer. In this situation, the degree of knowledge
sharing is given by formula 2.



$$ P(c, s_i) = \frac{1}{\mathrm{distance}\bigl(s_i,\ \mathrm{relSyn}(\mathrm{super}(c))\bigr)} \qquad (2) $$
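Formula 2 and the resulting synset selection can be sketched as follows. The graph and the sense names (`spring_source`, `spring_season`) are hypothetical toy data, not WordNet content. The sketch also assumes that the distance is counted in nodes along the shortest path (edges + 1), so that it is never zero and a direct hyperonymy link yields 0.5.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency sets from (hyponym, hypernym) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def node_distance(graph, start, goal):
    """Number of nodes on the shortest path (edges + 1); None if unreachable."""
    seen, queue = {start}, deque([(start, 1)])
    while queue:
        node, count = queue.popleft()
        if node == goal:
            return count
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, count + 1))
    return None

def sharing_degree(graph, candidate, parent_synset):
    """P(c, s_i) = 1 / distance(s_i, relSyn(super(c))); 0 if no path exists."""
    d = node_distance(graph, candidate, parent_synset)
    return 0.0 if d is None else 1.0 / d

def rel_syn(graph, candidates, parent_synset):
    """Candidate synset with the highest sharing degree."""
    return max(candidates, key=lambda s: sharing_degree(graph, s, parent_synset))

# Hypothetical fragment: two senses of "spring" and an assumed subsumer synset.
graph = build_graph([
    ("spring_source", "body_of_water"),
    ("body_of_water", "entity"),
    ("time_period", "entity"),
    ("season", "time_period"),
    ("spring_season", "season"),
])
candidates = ["spring_season", "spring_source"]
chosen = rel_syn(graph, candidates, "body_of_water")
```

Here `spring_source` is two nodes away from the subsumer synset and `spring_season` five, so the first one is selected.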



Several graph-theoretic measures can be used to calculate
the proximity between two synsets in WordNet. They are
mainly based on the number of edges that separate two nodes
in a taxonomy. The most commonly used measures in the
literature are Rada [23], Leacock & Chodorow [14], Hirst &
St-Onge [13] and Wu & Palmer [32]. The Rada measure is
considered the most obvious way to evaluate semantic
similarity in a hierarchical ontology. It corresponds to
the shortest path between two concepts in an ontology where
only taxonomic links, i.e. hyperonymy and hyponymy, are
considered. The Leacock & Chodorow measure is an extension
of Rada that is normalized by dividing by the maximum depth
of the hierarchy containing the involved concepts. The Path
measure adopts the same principle as the previous two
measures by taking the inverse of the number of nodes along
the shortest path between two nodes. As for the Hirst &
St-Onge measure, the similarity between two concepts is
determined by the minimum number of direction changes along
the path between the two concepts. According to this
measure, four relation types between two concepts are
distinguished: extra-strong, strong, medium and weak. The
Wu & Palmer measure evaluates the similarity between two
concepts from the depth of their most specific common
subsumer, i.e. its distance to the root of the ontology,
relative to the depths of the two concepts themselves.
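To make the edge-based measures concrete, the following Python sketch computes four of them on a small tree. The taxonomy is hypothetical toy data, and the formulas follow the commonly used definitions rather than any implementation described in this paper: Path = 1 / (edges + 1), Leacock & Chodorow = -log((edges + 1) / (2 D)) with D the maximum depth, and Wu & Palmer = 2 depth(lcs) / (depth(a) + depth(b)). Depth is counted in nodes, so the root has depth 1, which is an assumption of this sketch.

```python
import math

# Hypothetical toy taxonomy given as child -> parent links.
PARENT = {"entity": None, "body_of_water": "entity", "lake": "body_of_water",
          "stream": "body_of_water", "river": "stream", "pond": "lake"}

def ancestors(node):
    """The node followed by its successive subsumers up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain

def depth(node):
    """Depth counted in nodes (the root has depth 1)."""
    return len(ancestors(node))

def lcs(a, b):
    """Most specific common subsumer of a and b."""
    above_b = set(ancestors(b))
    return next(n for n in ancestors(a) if n in above_b)

def rada(a, b):
    """Number of taxonomic edges on the shortest path between a and b."""
    return depth(a) + depth(b) - 2 * depth(lcs(a, b))

def path_similarity(a, b):
    """Inverse of the number of nodes on the shortest path."""
    return 1.0 / (rada(a, b) + 1)

def leacock_chodorow(a, b):
    """Path length normalized by twice the maximum depth, on a log scale."""
    max_depth = max(depth(n) for n in PARENT)
    return -math.log((rada(a, b) + 1) / (2.0 * max_depth))

def wu_palmer(a, b):
    """Depth of the common subsumer relative to the depths of both concepts."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))

river_lake_path = path_similarity("river", "lake")      # 3 edges apart
river_stream_path = path_similarity("river", "stream")  # direct link
```

With this convention a direct subsumption link gives a Path similarity of 0.5.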



4 Experimentation 

In this section, we present and interpret the results of
applying our proposed evaluation method to a part of the
AKTiveSA ontology³. This ontology deals with a number of
geographical aspects of the knowledge infrastructure for
humanitarian and disaster relief operations. It encompasses
a wide variety of conceptualizations including terrain
features, transport routes, rivers, shorelines, terrain
elevation data, etc. [24]. The part to which we limit our
experimental study is a hierarchy of 23 concepts modeling
some Earth hydrographic objects (Table 2). Our scope of
analysis is restricted to concepts whose names appear in
WordNet.

3 http://www.zaltys.net/ontology/AKTiveSAOntology.owl

Figure 2: Taxonomic structure of the AKTiveSA ontology

As indicated above, the matching between the concepts of
the ontology to be evaluated and the WordNet synsets may be
supported by the texts associated with them, but also by
their subsumers in both hierarchies. The concepts of our
ontology have no labels. Table 1 shows the comments
associated with each concept of the analyzed part.



Table 1: AKTiveSA concepts and relative comments 



| Concept | Comments |
|---|---|
| Body of water | Represents planetary structures that are part of the hydrosphere and that have a primary substance composition of a water. |
| Aquifer | An aquifer is an underground structure of water-bearing, permeable rock. |
| Reservoir | |
| Pond | A pond is a body of water smaller than a lake. However the difference between a pond and a lake is largely subjective. The term pond usually describes small bodies of water, generally smaller than one would require a boat to cross. Another definition is that a pond is a body of water where even its deepest areas are reached by sunlight. |
| Lake | A lake is a body of water surrounded by land. |
| Stream | A stream is a body of water with a detectable current, confined within a bed and banks. Stream is also an umbrella term used in the scientific community for all flowing natural waters. |
| River | A river is a large stream, which may also be a water way. |
| Canal | Canals are man-made waterways, usually connecting existing lakes, rivers, or oceans. Irrigation canals are man-made waterways for the delivery of water and preceded the use of transportation canals used by barges or narrowboats on smaller canals, and by ships on ship canals that connect to the ocean. |
| Creek | In British English and Indian English usage, a creek is a tidal water channel. Creeks may often dry to a muddy channel with little or no flow at low tide, but often with significant depth of water at high tide. |
| Spring | A spring is a point where groundwater flows out of the ground, and is thus where the aquifer surface meets the ground surface. |
| Ocean | A large body of water constituting a principal part of the hydrosphere. |



The results of the evaluation of the structural proximity
between each concept and each of its corresponding synsets
are given in Table 2. For each of these concepts, we
indicate the Path similarity between each of its related
synsets and the synset that corresponds to its closest
subsumer concept in the AKTiveSA ontology. The most
relevant synset for a given concept is the one with the
highest Path similarity value (written in bold).



Table 2: The taxonomic proximity values between 
concepts and related synsets. 



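| Concept | #n#1 | #n#2 | #n#3 | #n#4 | #n#5 | #n#6 |
|---|---|---|---|---|---|---|
| Pond | **0.33** | - | - | - | - | - |
| Aquifer | **0.16** | - | - | - | - | - |
| Lake | **0.5** | 0.11 | 0.11 | - | - | - |
| Stream | **0.5** | 0.09 | 0.08 | 0.11 | 0.08 | - |
| Ocean | **0.5** | 0.11 | - | - | - | - |
| Reservoir | 0.1 | **0.5** | 0.35 | 0.25 | - | - |
| Canal | **0.25** | 0.12 | 0.10 | - | - | - |
| Creek | **0.5** | 0.11 | - | - | - | - |
| River | **0.5** | - | - | - | - | - |
| Spring | 0.09 | 0.09 | **0.14** | 0.11 | 0.09 | 0.08 |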
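The selection rule applied to Table 2 (keep, for each concept, the synset with the highest Path similarity to the synset of its closest subsumer) can be reproduced with a short Python sketch. The values are those of Table 2; the names `TABLE2` and `most_relevant_sense` are hypothetical helpers introduced for this illustration.

```python
# Path similarity values of Table 2, one list per concept, ordered by sense number.
TABLE2 = {
    "Pond": [0.33],
    "Aquifer": [0.16],
    "Lake": [0.5, 0.11, 0.11],
    "Stream": [0.5, 0.09, 0.08, 0.11, 0.08],
    "Ocean": [0.5, 0.11],
    "Reservoir": [0.1, 0.5, 0.35, 0.25],
    "Canal": [0.25, 0.12, 0.10],
    "Creek": [0.5, 0.11],
    "River": [0.5],
    "Spring": [0.09, 0.09, 0.14, 0.11, 0.09, 0.08],
}

def most_relevant_sense(scores):
    """1-based sense number with the highest similarity (first one wins ties)."""
    return max(range(len(scores)), key=scores.__getitem__) + 1

selected = {concept: most_relevant_sense(scores)
            for concept, scores in TABLE2.items()}
```

The first sense is retained for every concept except Reservoir (second sense) and Spring (third sense), which agrees with the sense numbers used in the column headers of Table 3.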



Except for the concept Canal, comparing the results
detailed in Table 2 with the definitions of the concepts
(Table 1) and their relevant synsets (Table 4) supports the
effectiveness of our approach for identifying concepts in
WordNet. We clearly notice that the definitions of the
AKTiveSA concepts are highly compatible with the glosses of
the synsets found. Furthermore, locating concepts in
WordNet helps to better understand their contexts. The
example in Table 3 reinforces this idea and shows that,
among the three synsets related to the concept Canal, the
first synset has the highest structural similarity to the
representative synsets of the other concepts.



Table 3: The Path similarity between the synsets of 
canal and the other concept synsets 





| | b. water #n#1 | aquifer #n#1 | reservoir #n#2 | pond #n#1 | lake #n#1 |
|---|---|---|---|---|---|
| canal#n#1 | 0.33 | 0.12 | 0.20 | 0.20 | 0.25 |
| canal#n#2 | 0.14 | 0.10 | 0.11 | 0.11 | 0.12 |
| canal#n#3 | 0.11 | 0.12 | 0.09 | 0.09 | 0.10 |

| | stream #n#1 | river #n#1 | creek #n#1 | spring #n#3 | ocean #n#1 |
|---|---|---|---|---|---|
| canal#n#1 | 0.25 | 0.20 | 0.20 | 0.12 | 0.25 |
| canal#n#2 | 0.12 | 0.11 | 0.11 | 0.10 | 0.12 |
| canal#n#3 | 0.10 | 0.09 | 0.09 | 0.12 | 0.10 |
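One simple way to summarize Table 3 is to average, for each synset of canal, its Path similarity to the representative synsets of the other concepts, and to check whether one synset is at least as similar as the others in every column. The Python sketch below does this with the values of Table 3. The averaging is an illustrative choice of this example, not a step of the proposed method.

```python
# Rows of Table 3: similarity of each canal synset to, in order, b. water,
# aquifer, reservoir, pond, lake, stream, river, creek, spring and ocean.
TABLE3 = {
    "canal#n#1": [0.33, 0.12, 0.20, 0.20, 0.25, 0.25, 0.20, 0.20, 0.12, 0.25],
    "canal#n#2": [0.14, 0.10, 0.11, 0.11, 0.12, 0.12, 0.11, 0.11, 0.10, 0.12],
    "canal#n#3": [0.11, 0.12, 0.09, 0.09, 0.10, 0.10, 0.09, 0.09, 0.12, 0.10],
}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

means = {sense: round(mean(row), 3) for sense, row in TABLE3.items()}
best = max(means, key=means.get)

# True if the best sense is at least as similar as every other sense, column by column.
dominates = all(
    a >= b
    for sense, row in TABLE3.items() if sense != best
    for a, b in zip(TABLE3[best], row)
)
```

The first synset has the highest mean similarity and is never below the other two in any column, the aquifer and spring columns being ties with the third synset.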



On the other hand, browsing through the glosses of the
three synsets of Canal, it seems clear that the definition
"long and narrow strip of water made for boats or for
irrigation", associated with the third synset, is more
appropriate to the context of geographic objects than that
of the first synset. A simple computation of the
terminological intersection between the concept comments
and the glosses of both synsets reinforces this observation
and allows us to conclude that this definition is the
closest to



